Nonparametric Bayesian Inference on 
Bivariate Extremes 

Simon Guillotte* Francois PerronJ and Johan Segers* 
November 17, 2009 

Abstract 

The tail of a bivariate distribution function in the domain of attraction of a bi- 
variate extreme-value distribution may be approximated by the one of its extreme- 
value attractor. The extreme-value attractor has margins that belong to a three- 
parameter family and a dependence structure which is characterised by a spectral 
measure, that is a probability measure on the unit interval with mean equal to one 
half. As an alternative to parametric modelling of the spectral measure, we propose 
an infinite-dimensional model which is at the same time manageable and still dense 
within the class of spectral measures. Inference is done in a Bayesian framework, 
using the censored-likelihood approach. In particular, we construct a prior distri- 
bution on the class of spectral measures and develop a trans-dimensional Markov 
chain Monte Carlo algorithm for numerical computations. The method provides a 
bivariate predictive density which can be used for predicting the extreme outcomes 
of the bivariate distribution. In a practical perspective, this is useful for comput- 
ing rare event probabilities and extreme conditional quantiles. The methodology is 
validated by simulations and applied to a data-set of Danish fire insurance claims. 

Keywords. Bayes, Bivariate Extreme Value Distribution, Extreme Conditional Quantiles, 
MCMC, Prediction, Rare Event Probabilities, Reversible Jumps, Spectral Measure. 

1 Introduction 

In areas such as engineering or financial risk management, decisions have to be made 
which depend on the extreme outcomes of two or more variables. Particular examples 
of interesting questions may be: how high should a dike be in order to withstand ex- 
ceptionally high levels of a river at several sites simultaneously ? How much capital 
to set aside in order to have sufficient reserve in times of financial crises, affecting the 
values of multiple financial securities at once ? 

These kind of problems require inference on the distribution of bivariate (or more 
generally multivariate) extremes. In particular, we are interested in the bivariate density 
of a random pair (X\,X2) on a quadrant [u\, oo) x [U2, oo), where u\ and 112 are large 
thresholds, that is V(X\ > u\) and P(^2 > ui) are positive but small. This density 
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can not only be used for the computation of rare-event probabilities over joint tail 
regions, but also of extreme conditional quantiles of one variable given an extreme 
outcome in the other variable. For instance, it may be of interest to find a level X2 
such that for a given probability p > (small) and a given value of x\ (large) we have 
P(Z2 > x-i | X\ = X\) = p. Up to the best of our knowledge, the latter type of problem 
has not yet been properly addressed in the extreme-value literature. 

More specifically, let (Xh,Xq), i = 1, ...,«, be a random sample from a bivariate 
distribution F in the max-domain of attraction of a bivariate extreme-value distribution 
G. Except for the case of asymptotic independence, the (bivariate) tail of F is well 
approximated by the one of G. The margins of the latter distribution are characterised 
by three parameters each, for a total of six parameters. We shall refer to these as 
the tail parameters. In addition, the dependence structure of G is characterised by 
a spectral measure, which can be any probability measure on the unit interval [0, 1] 
with mean 1/2. Thus, an approximation formula for F in a bivariate tail region is 
obtained by the specification of six marginal tail parameters and a spectral measure. 
It follows that approximate inference on the tail of F may be done via inference on 
these parameters. Inference on the spectral measure may be done within parametric 
families, see for instance Boldi and Davison] ( |2007 1 
|de Haan et al. 



Coles and Tawn (1991 



( |2008) ; |Einmahl et aJT| pOOg] ); |Joe et al.| ( fl992l >; |Ledford and Tawn] 



1994); 



( |1996| l; Smith | (l994|). Alternatively, one may prefer to proceed nonparametrically, see 
for instance [de Haan and de Ronde| ( |199"8l ); |de Haan and Sinha| ( [T999) ; |Einmahl et al.] 
( |200T][2006) ; |Einmahl and Segers| ( [2009l >; |Schmidt and Stadtmiiller| ( p006) . Surveys 



of these methods can be found for instance in the monographs [Coles (2001 ); Beirlant 



et al.|l|2004);|de Haan and Ferreira| ( |2006) ; |Kotz and N adarajah ( |2000) ; |de Haan and 
Ferreira|P006] i~ 



Data about extreme events being scarce by nature, the statistical uncertainty in 
extreme-value analysis is quite substantial. The question is how to deal with this un- 
certainty for practical purposes. A typical such purpose is prediction, the task being to 
compute a high return level, that is a level which is exceeded once, on average, during a 
long future time interval. The uncertainty due to the random nature of future outcomes 
then has to be combined with the statistical uncertainty on the parameter estimates. In 
a frequentist setting, it may be unclear how to do so: should one use a high quantile's 
point estimate or rather the upper bound of a certain confidence interval ? As argued for 
instance in |Coles and Tawn] ( |1996[ ) and |Coles and Tawn] (|2"005 ), the Bayesian approach 
via the predictive density ( Aitchison and Dunsmore[ |1975| l seems more coherent. How- 
ever, extreme-value dependence structures are essentially infinite-dimensional. There- 
fore, nonparametric Bayesian methodology for multivariate extremes should be further 
developed. Our paper aims to take a step in that direction. 



Essentially, we extend the censored-likelihood method developed in Ledford and 



Tawn ( 19961 to the case of arbitrary (i.e. infinite dimensional) spectral measures in a 



Bayesian setup. To this end, we select a prior on the six marginal tail parameters and 
on the set of spectral measures. Via the censored likelihood, the joint posterior of the 
tail parameters and the spectral measure is computed and converted into a posterior dis- 
tribution for the tail quantities of interest. The actual inference based on the posterior 
is performed by means of a trans-dimensional Markov chain Monte Carlo algorithm; 
see for instance Guillotte and Perron (2008) for similar work in the context of annual 
maxima, that is when data can be modelled directly by a bivariate extreme-value distri- 
bution. Our methodology enables the evaluation of the predictive density in a bivariate 
tail region, which can be used, for instance, for prediction of high future levels of one 
variable given such outcomes of the other one. 
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The prior selection for the spectral measure is the more delicate part, and this is the 
main contribution of the paper. We need to put a prior on 3^, where ^ is the set of 
all cumulative distribution functions on [0, 1] with mean 1/2. The actual prior will be 
concentrated on a certain subset c of spectral measures. The latter subset is 
constructed in such a way that for any desired precision e > and for any H e JSC, 
there exists H' e JS%" with 

sup \H'(w) - H(w)\ < e. 

we [0,1] 

At the same time, the set 3%" needs to be sufficiently manageable for actual compu- 
tations. The spectral measures in our construction are parametrised by a (constrained) 
vector, the components of which correspond to their atoms at and 1, and to a finite 
number of points in (0, 1). To each vector in the parameter space corresponds a spectral 
measure, specified via a cubic spline interpolation, which is absolutely continuous on 
(0,1). 

The outline of the paper is as follows. Since the model provides an approximation 
of the bivariate tail of F only, special care has to be taken on how to define the likeli- 
hood of the parameters given the data (Section^. The construction of the subspace of 
spectral measures ffl" is done in Section [5] The prior selection, the Bayesian frame- 
work, and the MCMC algorithm used for numerical computations are detailed in Sec- 
tion]?] In Section [5] we present the results of a simulation validating our methodology 
using artificial data and we analyse a data-set of Danish fire insurance claims, which 



has been previously studied in a univariate context by McNeil ( 1997]) and |Resnick 
(1997). Finally, we conclude with a discussion in Section|6] 



2 Modelling bivariate tails 

The domain-of-attraction condition on a bivariate cumulative distribution function F 
yields approximations for F on quadrants of the form [u\, oo) x [112, oo), where u\ and 
m are high thresholds. Here, a high threshold means that Fj(uj) is less than but close 
to 1, where Fj is the marginal distribution, j e {1,2}. From this approximation for F, 
approximations for interesting tail quantities can be derived. Since F is only approxi- 
mated on a subset of its support, care has to be taken when writing down the likelihood. 



The domain-of-attraction condition. We assume the existence of sequences of con- 
stants a„j > and b n j, j e {1, 2}, and a bivariate cumulative distribution function G with 
non-degenerate margins such that 

lim F n (a n \x\ + b n \,a n2 x 2 + b n2 ) = G(x u x 2 ) (2.1) 

for all continuity points (xi,x 2 ) of G. Here, G is called an extreme-value distribution 
function and is necessarily of the following form: 

• Its marginal distribution functions G\ and G 2 are univariate extreme-value dis- 
tribution functions, so 

-logGj(xj)= h+^-L-^L) , ye {1,2}, 

V (Tj I 

for xj such that cri + £j(Xj —fij) > 0, with shape parameter £ j (the extreme-value 
index), location parameter fij, and scale parameter cry > 0; 
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• Its dependence structure is given by 

- log G(x, y) = t{- log Gi (jc), - log G 2 (y)), (2.2) 

for all (x,y) e B 2 such that G\{x) > and G2(y) > 0, the tail dependence 
function ( admitting the representation 

£(s, t) = 2 J max(ws, (1 - w)f) (s, f) e [0, oo) 2 , 

J[0,1] 

the spectral measure H being a probability measure on [0, 1] with mean equal to 
J wH(dw) - 1/2. (Quite often, the name "spectral measure" is reserved for the 
measure 2H.) 

Given a pair of large thresholds, u\ and u-i, it will be convenient to rewrite the 
marginal distribution functions using a different parametrisation which incorporates the 
thresholds in the expression of these functions. More precisely, letting rjj = (gj, (j, crj), 
j e {1,2}, the margins G\ and G2 can be written as 

-log6\(.v ; //,•)- <,{l + ^r L ^r) ' • ye {1,2}, (2.3) 

for Xj such that crj + %j(xj - uj) > 0, where is again the extreme-value index, crj > 
is a scale parameter, and < £ ) = — log Gj(iij \ rjj) s 1- Gj(u j \ rjj), the marginal 
probability of exceeding the threshold Uj, Therefore, a bivariate extreme-value distri- 
bution function G is parameterised by its marginal parameter vectors 771 and 772 and its 
spectral measure H. From now on, we shall make this explicit. Such a cumulative dis- 
tribution function {x\,x<2) h-> G(x\,X2 \ H,r]\,r\i) is absolutely continuous if and only 
if the restriction of the spectral measure H to the interior (0, 1) of the unit interval is 
absolutely continuous. Still, the spectral measure is allowed to have atoms at and 1, 
so as to include, for instance, the case of independence where H — ^(6q + £1), <5 U being 
the Dirac measure at to. 

In statistical practice, the spectral measure is often modelled parametrically. In this 
article, however, we model H nonparametrically for maximum flexibility. 



The tail approximation. The tail approximation for F boils down to stipulating that 
for large thresholds u\ and u%, and for (x\ , x%) 6 [u\, 00) x [U2, 00), the form of F(x\ , X2) 
is that of a bivariate extreme-value cumulative distribution function. The justification 
comes from equation ( |2.1[ ) and the fact that extreme-value distributions are max-stable: 
for t > 0, the function G' is also a bivariate extreme-value distribution function and it 
differs from G by location and scale only. Thus, we assume that for (x\, X2) e [mi, °°) X 
[U2, 00 ), F(xi,X2) ~ F(xi,X2 I H, 771,772), where F(xi,X2 \ H, 771,772) has dependence 
structure given by ( |2.2| i, for some spectral measure H, and marginal distributions given 
by ( |2.3| >, for some parameter vectors 77! and 772. 

Furthermore, we assume that the approximant F(x\,X2 \ H, 771,772) is absolutely 
continuous on the region (jci , xi) e \u\ , 00) x [«2, °°)- Its density 

d 2 

f{x\,x 2 I H, t]i, 772) = - — —F(x u x 2 I H, 771,772), xi > u u x 2 > u 2 , 

OX \ OX2 

is a (rather complicated) function of the six marginal tail parameters given by 771 and 



772 together with the spectral measure H. It is derived in Appendix A. 1 
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The censored likelihood. The tail approximation to F is only defined on the region 
oo) x [u 2 , oo). Therefore, a delicate task is to define the likelihood contribution of 
a datum (xi,x 2 ) which may or may not fall into this region. We adopt a censoring 
approach, that is if Xj < Uj, then we pretend that Xj is censored by uj. Specifically, let 

(X* V X* 2 ) = (Xi V u u X 2 V m 2 ) and d = (l^^i), l[„ 2 ,oo)(X 2 )). If X* = x\ and X\ = x* 2 , 
define 

F(u u u 2 \H,T]i,m) ifrf = (0,0), 



/*(xj,X2 | H,7]i,r] 2 ) 



d 

—F(x u u 2 | H,jjum) if </ = (1.0), 

^-F(uux 2 \H, m , m ) if</ = (0,l), 
0x2 
3 2 

F(x u x 2 \H, m , m ) if<* = (l,l). 



<9x i3x2 



The exact expression for /* is given in Appendix A.l Let X 
1, . . . , «}, be a random sample from a bivariate cumulative distribution function F, with 
corresponding censored sample X* = {(X* : ,X* : ) : i = l,...,n). The likelihood is 
defined as 



m.,X* 2i ) : i 

L(H, m , m 1 r) = f]f(4,r 2i 1 H, m , m ), 



(2.4) 



1=1 



which depends implicitly on the thresholds U\ and u 2 . 



3 Modelling the spectral measure 

Within a fully Bayesian framework, we need not only put a prior on the marginal tail 
parameter vectors 771 and 772, but also on the set ,3$° of spectral measures, that is the 
set of probability measures on [0, 1] with mean 1/2. Let J^ c be the set of spectral 
measures whose only atoms, if any, are at or 1. Note that the distribution functions 
of such spectral measures are continuous on (0, 1). For notational convenience, when 
H is a probability measure on [0, 1], we denote H(yv) := H([Q, w]), for all w e [0, 1]. 

In this section, we construct a subset Jtf" c Jff of spectral measures that is dense 
in Jf? c with respect to the topology of uniform convergence of cumulative distribu- 
tion functions. The set Jff" takes the form of a countable union of finite-dimensional 
parametric families. The selection of a prior distribution on Jif" is discussed in the 
following section. 

The construction of ffl" is done in two steps: the approximation step (Subsec- 
tion 3.1 1, in which a dense class of discrete spectral measures is proposed, and the 
smoothing step (Subsection |3.2[ ), in which the distribution functions of these discrete 
measures are smoothed via monotone cubic splines. 

3.1 The approximation step 

It is well-known that the set of discrete spectral measures is dense in the family of all 
spectral measures with respect to the topology of weak convergence. The objective of 
this subsection is to present an explicit construction of such a discrete approximant to 



a spectral measure in ^ (Lemma 3.1 and Proposition 3.2 1. The approximating spec- 
tral measure has the same atoms on and 1 as the original measure and in addition, 
for a given m > 1, it has exactly m atoms in (0, 1) that all receive the same mass. 
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The approximation error, in terms of the largest distance between distribution func- 
tions (sup-norm), is bounded by 1 /m. In the next subsection, these approximants are 
smoothed out, yielding the family Jff" of spectral measures mentioned above. 

Lemma 3.1. For any cumulative distribution function <E> of a random variable W on 
[0, 1] and for any integer m > 1 there exists a cumulative distribution function *F of a 
discrete random variable Y on [0, 1] supported on at most m points such that 

E(W) = E(F) and sup |$(w) - ¥(w)| < l/m. (3.1) 

we [0,1] 

Moreover, if<$> is continuous on [0, 1], then Y can be uniformly distributed on m points. 
Proof. Let O -1 : [0, 1] — > [0, 1] be the generalised inverse of <1>, that is 



and let 



<& _1 (0 = inf{w e [0, 1] : <D(w) > ?}, for all t e [0, 1], 



qt = (D (i/m), i = 0, . . . ,m and q m+ \ - 1. 



There exist m points < 1 1 < to < . . . < t m < 1, which solve the following equations 

f* (o(w)- — )dw = — -, z=l,...,m. (3.2) 
J 9i _, v '« 7 m 

Note that < f; < for i 6 {1, . . . , m}. Let 5 = {y\, ■ ■ . ,yk\> where y\ < yi_ < . . . < 
are the k ^ I distinct values among t\, . . . ,t m , and consider the set of multiplicities 
M — \m\, . . . ,mit} given by 

nti = #{j : tj = yi, j = 1, . . . , m}, for all ; e { 1 , . . . , k). 

Now, let Y be a random variable on 5 with P(T = y,) = m, /m, for all / e {1, . . . , k). If 
¥ is the cumulative distribution function of Y, then we have 

rli rli 

<D(w)<iw = x f(w)dw, for all i = l,...,m + l, (3.3) 

•Jqi-i Jq,-i 

and this implies 

m-t-i /-.^ 

2/ 

m+1 pi 

Yj \ y(w)dw= I v P(w)rfw = E(F) = ^ 



Jol m+1 
0(w) <iw = ^ I 0(w) rfw 



Moreover, for w e [0, 1], if w 6 [q m , 1], then O(vv) = *P(w) = 1, while if w e [0, #„,), 
then there exists i e {1, . . . , m), for which w e [5,-1, (ft), so that 

/ — 1 / z — 1 i 

< O(w) < — and < < — . 

m m m m 

It follows that |cD(vv) - ^{w)\ < 1 /m, for all w e [0, 1]. Finally, if O is continuous, then 
^fe) = i/wi> for a ll i e {0, . . . , m), A: = m, and 

= qo < y\ < q\ < yi < ■ ■ ■ < q m -\ < y m < *« < !• 

See Figure[T]for an illustration. □ 
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Figure 1: Illustration of equation ( |3.3[ ) with m = 4, i = 2, and <& = Beta(3,3). The area 
of the shaded region is equal to the area under <t> between q\ and q^. 



Proposition 3.2. If H e 34? c , ho = H({0}), hi = H({1}), and m > 1, then there exists 
a discrete spectral measure H* with H*({0}) — ho and H*({1}) — hi and, in case 
ho + hi < 1, with exactly m atoms in (0, 1), such that 



sup \H(w) - H*(w)\ < (1 - ho - hi)/m. 

we [0,1] 



(3.4) 



Proof. First, notice that < h ,hi < 1/2. Now, h = 1/2 if and only if hi = 1/2, 
in which case H is Bernoulli(l/2), and the result follows by setting H* = H. If < 
ho, hi < 1/2, then there exists a continuous cumulative distribution function (D of a 
random variable W on [0, 1] such that we have the representation 

H(w) = h 6o(w) + (1 - h - hi) <5(w) + h\ 6i(w), for all w 6 [0, 1], 

where d u {w), w e [0, 1], is the cumulative distribution function of the Dirac measure 
at ai. By Lemma [3~T| there exists a cumulative distribution function *P of a discrete 
random variable Y, uniformly distributed on \y\, . . . ,y m ), with < yi < y^ < ■ ■ ■ < 
y„, < 1, such that E( W) = E(Y) and sup w |<D(w) - ¥(w)| < 1/m. The result follows by 
considering the discrete spectral measure H* defined by its distribution function 



H*(w) = h 6o(w) + (l-ho- h) ^(w) + hi 6i(w), for all w e [0, 1]. 



□ 



3.2 The smoothing step 



The second part of the construction of M" is motivated by Proposition 3.5 below. 



which shows that a uniform approximation bound may also be obtained from finite- 
dimensional families ,3V m , m > 1, of spectral measures that are absolutely continuous 
on (0,1). 

Here we construct Jtf? m , for all m > 1. The geometrical construction is best un- 
derstood by occasionally referring to Figure [2] First, note that in Proposition 3.2 the 
approximating discrete spectral measure H* depends on < ho, hi < 1/2, and, pro- 
vided ho, hi < 1/2, on points < y\ < . . . < y m < 1 satisfying (1 —ho — hi)y + hi = 1/2. 
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In view of this, for m > 1, let 

©m = ®m,hoM (3-5) 

(ft ,/ii)e[0,l/2] 2 

where, forO < /io,/ii < 1/2, 

©m.fto.A, = \ho) x |(yi, . . . ,y m ) : <yi < • ■ • < y m < 1, y = — 1 x f/ii}, 
[ 1 - /z - «i J 

and where m j/2,i/2 = {(1/2, 1/2)}. The proof of the following lemma is straightfor- 
ward. 

Lemma 3.3. If 9 - (ho,y\, . . . ,y m ,h{) e ® m , and if ' H* g is the discrete probability mea- 
sure with atoms H g ({0}) — ho, H g (\yj}) = (1 — ho — h\)fmfor all i e {1, . . . ,m), and 
Hg(W) = hi, then H g is a spectral measure. 

The next lemma is used to define a mapping of m into the space of spectral mea- 
sures which are absolutely continuous on (0, 1). 

Lemma 3.4. If 6 — {ho,y\, . . . ,y m ,hi) € ©,„ and if ' H g is the discrete spectral measure 
given in Lemma 3.3 then there exist two probability measures on [0, 1], Hg _ and Hg,+, 
which are absolutely continuous on (0, 1), and such that 

H e -{w) < H* g (w) < H g ,+(yv), for all w e [0, 1], 

sup \H m g (w) - H e , ± (w)\ < (1 - h - hi)/m. (3.6) 

we [0,1] 

Proof. Following Fritsch and Butland ( 1984), we construct two continuously differen- 
tiable monotone cubic splines <pg _ and <pg^ + for which 

i- 1 

<Pe,-(yd = ho + (1 -h - h) , i e {l,...,m}, 

to 

with <p 9 -(0) = <pg-(yi) and lim u . T1 <p e ,-(w) = ipe-fym), and 

i 

<Pe,+(yd = «o + (l - «o - «i)— . i 6 {!,-• • ,»*}, 

to 

with ^e j+ (0) = <fe,+(yi) and lim w fi ^e,+(w) = (pg r+ (y m ). The details of the construction 
are given in Appendix |A.2| Now, let 

He,±(w) = (pe,±(w), for all w e [0, 1), and Hg t± (l) = 1. 

We have Hgjw) < H*(w) < # e ,+(w) for all w e [0, 1], as well as ( pTg) , □ 

The mapping of © m into the space of spectral measures which are absolutely contin- 
uous on (0, 1) is defined as follows. Let 9 e ©„,. In the trivial case where ho — 1/2 = hi, 



let Hg be a Bernoulli(l/2). If < ho, hi < 1/2, then from Lemma 3.4 we get 

a_ = I //fl_(w)t/w< I Hg(w) dw — 1/2 < I //g + (w) c/w = a + . 
Jo Jo Jo 

Now let 

a + - 1/2 

a=— — and H e = aH 6 - + (1 - a)H e+ . (3.7) 

a + — a- 

Clearly, in any case, Hg is a spectral measure, and it is absolutely continuous on (0, 1). 
This, in turn, enables us to define - {Hg : 9 e ©„,}. Figure |2]below illustrates the 
following proposition. 
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Figure 2: The illustration shows the distribution functions from Lemma[34]and Propo- 
The step function is H* 



sition 
is Ha 



3.5 



the dashed line is Hg and the dash-dotted line 



The thick line is the true spectral measure H from Proposition 3.5 in which 
H* = H*, and the dotted line is the smooth H'. 



Proposition 3.5. If H is as in Proposition \3.2\ < ho, hi < 1/2, and m > 1, then there 
exists H' G M'm such that 

sup \H(w) - H'(w)\ < 2(1 - h - hi)/m. (3.8) 

W€[0,1] 



Proof. By Proposition 3.2 there exists a discrete spectral measure H* - Hg defined 



by 9 — (ho,y\, . . . ,y m , hi) € 0,„ such that ( |3.4| > holds. Now, let Hg - and Hg t+ be as in 
Lemma [33] and take H' - Hg as in p.7). In view of ([3.6), the bound in (|3.8) holds. □ 



Finally, in order to allow for maximum flexibility in the model, we take 



m> 1 



as the space of spectral measures on which we shall concentrate a prior. This is done 
in the following section. 



4 Bayesian framework 

The model approximating the tail of the bivariate distribution F is specified via a spec- 
tral measure which belongs to M" and marginal parameters (771, 772) e H 2 , where 

a = (-00, 00) x (o, 00) x (o, 00). 

The parameter space for 3tf" is given by = Um>i ®m» with ®m as m ( [3.5) . Thus, 
the parameters defining a spectral measure consist of a model index m and, given the 
model, a parameter vector 6 = (ho,yi, . . . ,y m ,h\) e m . We assume a priori that 7/1, 



772 and (m, 6*) are independent, and in Subsection 4. 1 we select priors for each of these 
quantities. 
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We develop the Bayesian framework in the same spirit as that of model selection, 
see for instance Robert (2007 ). Let X — {(Xi 1,^21), . . . , {X\ n ,X2 n )} be a random sample 



from F, let X* be the corresponding censored sample, and assume for the moment that 
the priors have been selected. The joint posterior for the tail of F is given by 

7r(H mg ,rii,T]2 I X*) cx L(H m , g ,rii,m I X*)n(9 \ m)n{m) ni{rj\)7i2 (%). 



where L is the likelihood given by equation ( |2.4| l. In Subsection 4.2 we give an MCMC 
algorithm which is used for numerical computations in every inference procedure that 
we propose throughout the rest of the paper. 

In particular, for each m > 1, let n{m \ X*) be the posterior probability of selecting 
model m. We define the Bayes estimator for the tail of F as the mixture 

(H, f, u fj 2 ) = J] ^ m I X*)(H (m \ fff, f,f\ (4.1) 

where {H {fn \ ff" , ?)! ) is the ^ 2 -Bayes estimator inside model m. The estimator ( |4.1| i 
is evaluated numerically via the sample mean of the trans-dimensional Markov chain 
constructed in Subsection [< 



4.1 Selection of the prior 

Here, we do not claim any originality in the selection of the prior for the marginal 
parameters as we simply choose the maximal data information (MDI) prior proposed 
in |Beirlant etalj ( |2"0"()4"] l: 

tt/tjj) = exp{E[log fj(X | t,j)] } cx k. eX p{-( 1 + &)( r + log £,)}, j e {1, 2}, 

where fj is derived from ( |2.3| l and y = 0.577215 ... is Euler's constant. This choice 
is motivated by the fact that it is an objective prior which is simple to implement. 
Alternatively, export knowledge on certain marginal return levels may be incorporated 



in the prior as in|Coles and Tawn ( 1996 ). 



The main novelty is the selection of the prior on the spectral measure H„ u g via priors 
on m and 9. In particular, we need to specify a prior for the model index m, and for 
every m in its (infinite) support, we need to select a density for 9 e © m with respect 
to an appropriate reference measure. Selecting a density for 9 is a delicate problem as 
such a choice can give rise to an intractable normalising constant needed in evaluating 
the estimator ( |4.1| l using a reversible jumps algorithm as we do. 

First, we assume that m is drawn from a 0-truncated Poisson(zl) distribution, n{m) = 
A m l(e A - \)m\ for m > 1. Next, inside each model © m , m > 1, we consider an appro- 
priate Hausdorff reference measure. More precisely, for a positive integer k, let fik be 



the ^-dimensional Hausdorff measure, see for instance Billingsley ( 1986 chapter 19). 
The set ©„, is an (m + l)-dimensional surface in It m+2 , with < ju m+ i(® m ) < 00. We 
assume that the parameter vector 9 - (ho,yi, ■ ■ . ,y m , h\) is uniformly distributed on ©„, 
with respect to /i m+1 : 

n{6\m)= ——, forall0s0 m . 

Because of its trans-dimensional nature, implementation of the MCMC algorithm 



in Subsection 4.2 requires the exact knowledge of n(9 \ m) and thus of the normalising 
constants fi m+ i (©,„), for all m > 1. These normalising constants are given by the 
following result. 
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Lemma 4. 1 . For all m > 1, we have 

-1/2 r l/2 



I Mm-l(®mA),A, ) dho dh u (4.2) 
o Jo 



with 



fi m -i(®mMM) ~ m ]( m -i)\\i - ho-h!) P \ Y(m) < 7*1/2 -A,)]' 

where F( m ) ;s ?/ze maximum of Y - (Y\ , . . . , F m ), anc/ Y is distributed according to a 
Dirichlet(l,. . . ,1), that is the uniform distribution over the unit (to — 1)- dimensional 
simplex. Furthermore, 



m-l 



where K is the greatest integer less than m(l /2 — h\)/(l — ho — h\). 

Proof. For m — 1, we have /i2(@i) = 1 /4, and the result holds. For m > 1, let 

®Ua = 0o) x |(y 1; . . . ,y m ) e (0, 1)"' : y« = fT^T^ } x ^ 

so that /i m _i(©^ A A ) = m! /i m -i(©mAiJii)- By considering the change of coordinates 
jc; = [(1 — ho — h\)/m(l /2 — hi)] yt, i — I,. . . ,m, let 



L i - - ^i v 

m(l/2-hi) f-f 



we get 



yfmdy\---dy m -\, (4.4) 



X 



0' , , 



l m(l/2-hM m - 1 C 
^[l-ho-hj J s 



m-l 



a/to (m(l/2-h 1 )\ n 1 / \-h () -hi 

" I Mm) < 



(m- 1)! \ 1 -ho-h ) \ K ' mil/2-h)) 

where F( m ) is the maximum of Y = (Fi, . . . , F,„), and F is uniformly distributed over the 
standard (to- l)-dimensional simplex. Note that the term -\/m in ( |4.4| i is the value of the 
(generalised) Jacobian which appears as the integrand when expressing the Hausdorff 



measure of a smooth surface in terms of the Lebesgue integral, see Billingsley ( 1986 
p. 253). Figure[3]illustrates the above quantities. Finally, equation ( |4.3| l is derived from 
a result which can be found in Fisher ( 1929), namely 

Li/vJ / x 

P(Y ( ,n) <y) = i - Ec-d* -1 ^ 1 _ ky)m ~ l > for * ]ly 6 [1/m ' 1] - 



Note that for to > 1, the double integral in ( |4.2| i is to be evaluated using numerical 
quadrature. 
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Figure 3: Illustration of ju m -i(0 m A,,/iiX form = 10, (ho, hi) e [0, 1/2] 2 . The amplitude 
has been rescaled by a factor of 3.75 X 10 7 . 



4.2 The MCMC algorithm 

We now give the reversible jumps algorithm used for numerical computations. Recall 
the decomposition = Um»i ®w At each iteration, the spectral measure of the pro- 
posed candidate can either belong to the same space 0,„ as the spectral measure of the 
current state, or to m _i or to m+ i. Our objective here is to outline the algorithm and 
to give the geometrical intuition behind each move. The technical details are given in 



Appendix A. 3 



Let J* = {0, 1, . . . ,m + 1} be the set of indices of the components of the parameter 
vector 9 = (Oo, • ■ ■ , # m +i) = (ho,y\, ■ ■ ■ ,y m , hi) £ 0m- At every iteration 1 < t < T, 
where T is the length of the chain, choose between Move 1 and Move 2 with equal 
probability p\ = 1/2 = p2- 

Move 1: Propose a new spectral measure. 

Select between the three following moves with equal probability when 
m > 1. In case m = 1, only the moves 1.1 and 1.2 are considered. 

1.1 Propose a candidate in m . Select (i, j), i < j e with probability 
l/( m 2 2 ), and fix k , for all k + Propose a random move for the 
couple (6 h 6j) such that the resulting vector 9' remains in m . 

1.2 Propose a candidate in m+ i. This is done by inserting a new y' 
inside (0, 1), and by moving y^, a randomly selected point among 
yi, . . . ,y m , inside (0, 1) so that equilibrium is preserved, that is y' - 
(1/2 - hi)/(l - h Q - hi) = y, where / = {y' v y' 2 , . . . ,y' m+i ] is the 
proposal. 
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1.3 Propose a candidate in 0,„_i. This is the inverse of move 1.2. 
First, select two points yj < y < y^, randomly among )>i,}>2>- - • ,ym- 
Then, remove the one that is closest to the equilibrium point y = 
(1/2 - h\)/(l - ho - hi) and move the remaining one inside (0, 1) 
in such a way that equilibrium is preserved, that is f - y, where 
f = \y' v ■ ■ ■ >y m -i] is the proposal. 

Move 2: Propose a new margin. 

Here, a new value r(. = (^, a*) e Hj, for either j = 1 or 2 is proposed. 

The above algorithm generates an ergodic Markov chain on which can be used, 
in particular, to approximate the Bayes estimator ( |4.1[ ), by taking the sample mean of 
the generated spectral measures. Here, we use it to illustrate the prior on the spectral 
measure distribution functions H m g induced by n(6,m) = n(8 \ m)n(m) specified in 
the previous subsection. Figure |4] below shows a region S such that for every vertical 
section S(w), < w < 1, we have P[H(w) e S(w)] = 0.95 a priori, together with the 
prior pointwise mean E[H(w)]. 




0.2 0.4 0.6 0.8 1.0 



Figure 4: A region S with P[H(w) e S(w)] = 0.95 a priori for every w e [0, 1), 
together with the prior pointwise mean E[H(w)]. 



5 Examples 

In order to provide some validation for our methodology, we first apply it to artificial 
data in Subsection |5.1| Next, in Subsection |5.2| we analyse a dataset of Danish fire 



insurance claims, a univariate version of which has been previously studied in McNeil 
( f!997) and |Resnick| ( [T997l l. 

5.1 Artificial data 

We consider the following bivariate cumulative distribution function: 

F r (x u x 2 ) = (l- — Yl - — Yl + — — 1 jci.Jfc > 1; < r < 1. 

V X\ l\ Xo /\ Xi + X? I 
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For any value of the dependence parameter r, both marginal distributions are unit 
Pareto, which has cumulative distribution function G(x) = 1 — l/x, for x > 1. 

The spectral measure H, of the extreme-value attractor of F r has atoms H r ({0}) = 
H r ({l}) = (1 - r)/2 and a constant density h r (w) = r,w e (0, 1), that is, 



H r (w) 



\ l -^+rw if0«w<l, 
II ifw = l. 



see e.g. Einmahl and Segers (2009). We can simulate from F r using an Accept-Reject 
algorithm. Here, for every r e {0.2,0.4,0.6,0.8}, we simulated 1000 samples each of 
size n — 1 000. The thresholds u\ and U2 were set at the 90th percentiles of the marginal 
samples. Figure [5]shows the Bayes estimates of H r , for every r. 





(a) 



Ho.2 



(b) 



H .A 





(c) 



ffn 



(d) 



ffo.8 



Figure 5: The median of 1 000 Bayes estimates (thick line), along with a 95% (point- 
wise) confidence band (grey region) for the true cumulative distribution function of the 
spectral measure H, (dotted line), for r - 0.2, 0.4, 0.6, 0.8. 



5.2 Danish Fire Loss Data 

The data set comprises 2 167 industrial fire losses and was collected from the Copen- 
hagen Reinsurance Company over the period 1980 to 1990. We are indebted to Alexan- 
der McNeil (Heriot-Watt University) for making these data available through his per- 
sonal homepage on http://www.ma.hw.ac.uk/~mcneil/data.html The com- 
pany's figure for compensatory damage is divided in three categories, namely damage 
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to building (Xi), damage to furniture and personal property (X2) and loss of profits due 
to the incident (X3), see Figure[6] 



Mr- 



100 120 140 



20 40 



100 120 140 



(a) Damage to building (Xi ) vs Damage to furniture (b) Damage to building (Xi ) vs Loss of profits due to 
and personal property (X2) the incident (X3) 



(c) Damage to furniture and personal property (X2) 
vs Loss of profits due to the incident (X3 ) 

Figure 6: Pairwise scatter plots of loss claims. 



In McNeil ( 1997| l and Resnick (T997}, the data-set is analysed as if it is univariate, 
by combining the three categories into one loss figure: X\ +X2 +X3. However, the three 
types of loss compensations involve different portfolios, and in view of this, it is in the 
insurance company's interest to know the dependence among these types of compen- 
sations. This is where our methodology becomes useful. Essentially, we are interested 
in the rare-event probabilities and the extreme conditional quantiles described in the 
introduction. Here, these tail quantities are derived respectively from the restriction on 
the region [u 1 , 00) x [U2, 00) of the joint predictive density 

f*(x\,x* 2 I X*) = V I f(x\,x* 2 I H mfi , 771,772) n{H mfi , 771,772 | X*) dQdr\ x drj 2 , 

(5.1) 

and the conditional predictive density 



f*(x* u x* I X*) 

/ 2|1 C* 2 I ) - f;(x * lXt) > 



(5.2) 



where X* = {(^i,-, X* 2t ), i - 1, . . . ,2167), is the censored sample. 
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A univariate analysis of the data led us to choose the 90th sample quantiles for the 
thresholds u\ and u%. For descriptive purposes, we now investigate the dependence 
structures for all pairs of compensatory damage categories. The Bayes estimates and 
95% pointwise credibility sets for the cumulative distribution functions of the spectral 
measures are given in Figure [7] 




(a) Damage to building (X\ ) vs Damage to furniture (b) Damage to building (X\ ) vs Loss of profits due to 
and personal property (X2) the incident (X3) 




(c) Damage to furniture and personal property (X2) 
vs Loss of profits due to the incident (X3 ) 

Figure 7: Bayes estimate (thick line) and 95% pointwise credibility sets for the cumu- 
lative distribution function of the spectral measure for each couple of compensatory 
damage category. 

We focus on the first couple X\ and X2. Figure |7|a) clearly indicates that the two 
variables are not independent in the region \u\ , 00) x [112, 00). The joint predictive den- 
sity ( |5.1| l is shown in Figure [8] Finally, the mean of the conditional predictive den- 
sity ( |5.2| >, that is the predicted value of the claim X2 given the claim X\ = x\, and the 
95% quantile of the predictive conditional distribution are shown in Figure [9] 

6 Discussion 

We have provided a nonparametric Bayesian framework for bivariate extreme-value 
analysis. On the one hand, the nonparametric nature of the dependence structure 
(spectral measure) is fully respected. On the other hand, for purposes of prediction 
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9 10 



0.0005 0.0010 0.0015 0.0020 0.0025 0.0030 0.0035 0.0040 0.0045 



Figure 8: Image of the joint tail density \5.\\ in the region [u\, oo) x [U2, °°). 




Figure 9: The data (points), the predicted value of X2 (black line) given the claim 
Xi, along with the 95% pointwise quan tiles (grey line) of the conditional predictive 
density (|5.2[). The data are also plotted. 
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of future high levels (even conditionally), the predictive distribution incorporates both 
process and estimation uncertainty. Actual computations are performed using a trans- 
dimensional Markov chain constructed via a Metropolis-Hastings algorithm. Software 
written in (parallel) C++ wrapped in a Python environment may be obtained from the 
authors. 

Conceptually, it is not hard to see how to generalise the approach to arbitrary di- 
mensions. Practically, however, there are some serious obstacles to be overcome. First, 
in dimension d, the censored likelihood branches out in 2 d parts. Moreover, the spec- 
tral measure sits in the exponential of the bivariate extreme-value distribution, and 
computation of this distribution's density becomes tedious. Second, in dimension d the 
spectral measure may be an arbitrary probability measure on the unit simplex in R d 
satisfying a certain number of mom ent constraints. It may have a density on each of 
the 2 d ■ 



1 faces of the unit simplex ( Coles and Tawn 



1991 ). These densities are to be 



specified via splines or polynomials or some other set of basis functions. A prior needs 
to be specified on a manageable but still dense submodel of spectral measures. Then, 
efficient MCMC methodology should be proposed for numerical computations. 

Finally, the bivariate tail approximation via extreme-value distributions is not de- 
signed to deal with asymptotic independence, in which case the tail approximation 
degenerates to exact independence, an approximation which may be unsatisfactory for 
instance in case of the bivariate Gaussian distribution with a high correlation. A dis- 
tributional model encompassing both asymptotic independence and dependence has 
been proposed in Ramos and Ledford (2009), based on Ledford and Tawn ( 1996) and 
|Ledford and Tawn ( 1997| l; see also jResnick (2003) and Beirl ant et al.| ( |2004| l. In this 
framework, dependence is characterised by a parameter r\ e (0, 1] and an angular mea- 
sure H on (0, 1) whose total mass may be infinite and which satisfies a certain moment 
constraint. Yet another framework to deal with conditional extremes and asymptotic in- 



dependence is proposed in Heffernan and Tawn (2004 1; see also Heffernan and Resnick 



(2007 1 and jResnick ( 2008| l. Again, extremal dependence is described by a possibly in- 
finite spectral measure. 
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A Appendix 

A.l Censored likelihood 

We give explicit formulas for the censored density /*. Let 

Fj(xj) = exp + fj Xj ~" J ) ' f , j e {1,2}, 

so that 

/,(,,) = = ^(l + 6 ^ ') j < (1,2), 

for > «j such that cry + gj(Xj - Uj) > 0. Let 

F(x u x 2 ) = exp{-€(-logFi(xi),-logF 2 (x 2 ))}, 

where 

£(s, f) = 2 f max(ws, (1 - w)f) #(dw), (j, t) e [0, oo) 2 . 

J[0,1] 

Let h(w) = ^H(w), w e (0, 1), be the Radon-Nikodym derivative of H with respect to 
the Lebesgue measure on the open unit interval. We have 

f wh(w)dw = l/2-H({l}) and f (1 - w) h(w) dw = 1/2 - H({0}). 
Jo Jo 



Furthermore, 

#({!})+ I wh(w)dw\, 



os 
d 

-€(s,t) = 2 
at 



J wh(w)dw^, 



v 

//({())) - J' (I - \v)h(u-) ( hv\. 



St I t \ 

-€(s,t) = -2- -.hi , 

f (s + t) 3 \s + t) 



dsdt ' (s + t) 3 

and we have the representation 



■t q2 

s,t) = s + t+ I I - — —£(a, t) dcrdr, (s, t) e [0, oo) 2 . 
' da- ot 



Jo Jo 

Finally, f*(x\, x*) is equal to: 



F(u l ,u 2 ) if Xi < Mi, X2 < U2, 

-i/ft-i ^ 



(l+fl^f 1 ) — ^(-l0 g Fi(Xi),-l0gF 2 (M 2 ))F(Xi,M 2 ) ifXi >U lt X2<U2, 

(l+fc^j -^(-l0 g Fi(Mi),-l0gF 2 (x 2 ))F( Ml ,X 2 ) ifXi <U U X 2 >U2, 



where 



ft. 



A^(-logFi(xi), -logF2(x 2 ))F(xi,x 2 ) if x\ > Mi, x 2 > « 2 , 



A£(j,0 = ^-l(s,t)jl(s,t) - ^Q t ^,t). 
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A.2 Construction of the interpolating monotone cubic spline. 

Let = yo < y\ < y% < • • ■ < y m < y m+ \ — 1 be the set of abscissas, and let 
tpo < (fi < <f2 < • • ■ < <p m < fm+i be the set of ordinates. We construct a piecewise 
cubic polynomial function <p(-) such that 

<p(yd = fi' f° r a U i = 0, . . . ,m + I, 

and such that (/?(•) is nondecreasing and continuously differentiable on (0, 1). 
On every interval [y,,y, + i], i = 0, . . . , m, we expand tp around y, and we get 

di + d i+ \ - 2A; 3 -2di - d i+ \ + 3A,- 2 

WO = (y - y;) + ; (y - yd + dAy - yd + ipi, 

hf hi 

for all y e [y,-,y,+i], where A,- = {tp M - ipdlh, hi = y M - y h and where d t and d M 
are the endpoint derivatives. Thus, the construction of the spline depends only on the 
specification of the set of endpoint derivatives < d§,d\,dz, - ■ ■ ,d m ,d m+ \. |Fritsch| 
and Carlson (1980) give necessary and sufficient conditions on these values in order 



to guaranty monotonicity throughout [0, 1]. For instance, setting di = 0, for all i = 
0, . . . , m+1, produces a continuously differentiable nondecreasing interpolant, although 
the resulting curve is not very smooth. On the other hand, Fritsch and Butland ( 1984[ > 



propose a method for determining the endpoint derivatives which produces smoother 
curves. In fact, it suffices to set do = = d m +\ and 

C A,-_iA,- -, AA A 

— — if A,-_i A,- > 0, 

di = I orAf + (1 - or)A f _i 

[0 otherwise, 
where a = \{l + j^^), for i = l,...,m. 

A3 Implementation of the MCMC algorithm. 

Move 1: Assume that the current state is 6 — (6q, . . ., 9 m+ \) = (/io,yi, • • ■ ,y m , hi) e @,„, 
and let y = (1/2 - h\)/(l - ho - h\). Let p\(m) = p2(m) = p-iim) = 1/3, for all 
m > 1, and p\(m) — 1/2 = pi(ni), p->,(m) — 0, if m — 1. Select move 1,2 or 3, with 
probability p\, P2 and respectively. For a real interval /, let a tt (I) denote the uniform 
distribution on /, and let A(I) denote its length. 

1.1 Propose a new 6' €® m . 

1.1.1 Select (z, f),i< j 6 J = {0, 1 m+ 1}, with probability l/i^f). 

(a) Case i = 0, j = m + 1, Draw u ~ (/o, m +i), where 

/ / -l/2 + (l-/io)y | \ 
k,m+i = I max i-ho, > , 1/2 - h 1 , 

set h' - ho + u and h' x = 1 ^ 2 " ( ]~-°"" )> . 

(b) Case i = 0, 1 < j < m. Draw u ~ <%f (Iqj), where 

-(1 -h -hi)yj} 



Ioj = {max i-ho, 



set h' Q — ho + u and y* = 



my - yj 



. (l-h-hKl-yj) 
mm < 1/2 - ho, 



my + 1 — yj 

m( 1 /2— /i[)-(l— ho-hi -u)(my-yj) 



})• 



l-/ln-/Zi -H 
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(c) Case 1 < ; < j < m. Draw u ~ (7 (J ), where 

Iij = (max\-yi,yj - 1), - - y ' j , 

set y* =y t + u and y* = y,- - m. 

(d) Case 1 < i < m, j = m + 1. Draw u ~ % (/,>,+ 1), where 

7 (>+ i = |max{-y ( ,-my},min jl -yu-my + 2 (i"l h {) ) }) ' 

set y* = yi + u and = -^1/2-^(^1-^) ^ 

1.1.2 The above operation yields a new, possibly unordered vector (y*, . . . ,y* m ). 
Sort (yf, ...,y* m ) into ascending order and denote the resulting vector as 
(y\ , . . . , y' m ), where < y[ < ■ ■ ■ < y' m < 1 . Denote the proposed param- 
eter vector by ff, so that ff = (h' ,y[, . . . ,y' m , h\). 

1.1.3 Accept ff with probability: 

ff = L(H nhff , m , m \X*)g{Q\ff) 

L{H m ,e,m,m\x*)q(.ff\oy 

where q(- \ 9) = 1//K7,y) is the proposal density. 

1.2 Propose a new ff e m+ i. 

1.2.1 Select k e {1, . . . ,m}, with probability l/m. 

1.2.2 Draw h ~ ^(4), where 

( (0,y*A(l-y)) ify*<y, 
4= ((y,-l)v(-y),0) otherwise, (A ' 1) 

insert a new y^ +1 = y + u, and set yjj; = y* - m. 

1.2.3 Sort (jj, . . • into ascending order and denote the resulting vector as 
(y[, . . .,y' m+1 ), where < y\ <••• < y' m+l < 1. Denote the proposed param- 
eter vector by 6', so that ff = (ho,y[, . . ■,y' m+v h\). 

1.2.4 Let m + = #{k : y' k > y, k = 2, . . . , m + 1} and m_ = m + 1 - m+. Accept ff 
with probability: 

, _ L(H m+U ff, m ,ri2 I X*)p 3 (m+ 1) [l/(/n_ 
a( ' L(H nh6 , m , m \X*)p 2 (m)[l/m][l/A(I k )] ' 

1.3 Propose a new 0' e © m _i. 

1.3.1 Select j and such that < yj < y < y& < 1, with probability l/(m_ m+), 
where m + = : y^ > y, k = 2, . . . , m] and m_ = m - m+. 

1.3.2 Let w = (y - y/) A (y* - y), 

(a) if y - yj < y t - y, then set yjj; = y* - m and remove y j, 

(b) otherwise, set y* = y^ + m and remove y^. 

1.3.3 Sort (y*, . . . ,y^_ t ) into ascending order and denote the resulting vector as 
(y'j, . . . ,y' m+x ), where < y\ <••• < y' m l < 1. Denote the proposed param- 
eter vector by 8', so that ff = (ho,y[,. . . ,y' m _ v h\). 
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1.3.4 Accept 6' with probability: 



L(H m _ ue ,, m ,r] 2 \X*)p 2 (m- l)[l/(m- 1)] [1 / A(I jk )] 
L(//,„, e , 771,772 I X*)p 3 (m) [l/(m_m + )] 



where 



I Ij if y-yj >y k ~y, 
1 I h otherwise, 

and where Ij and 7j are given by ( |A. 1 [ >. 
Move 2: Assume that the current state is 771 = (£1 ,(\,cr\) and 772 = {£2, £2, 02) m S. 

2.1 Select jf 6 {1,2} with equiprobability. 

2.2 Draw Zk ~ ^K(0, 1), independently for k — 1,2, 3, set 



% = ?y + 0.01zi, 
£ = ^exp(0.01z 2 ), 
cr'j = crj exp(0.01z3), 



and denote the proposed vector 77^. 
notational convenience. 



c'y)- Also denote 77J = 77,, i + j, for 



2.3 Accept 77' with probability: 



L(H m , g ,Tf v rf 2 I X*)nj{rfj)q{r}j \ rf) 



L(H m ,e,VuV2 I X*)7tj(rij)q(rfj \ TjjY 



where q(- \ rjj), is the proposal density. 
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